One of the principal statistical features characterizing the activity in financial markets is the
distribution of fluctuations in market indicators such as the index. While developed stock
markets, e.g., the New York Stock Exchange (NYSE), have been found to show heavy-tailed return
distributions with a characteristic power-law exponent, the universality of such behavior has been
debated, particularly in regard to emerging markets. Here we investigate the distribution of several
indices from the Indian financial market, one of the largest emerging markets in the world. We
have used tick-by-tick data from the National Stock Exchange (NSE), as well as daily closing data
from both the NSE and the Bombay Stock Exchange (BSE). We find that the cumulative distributions of
index returns have long tails consistent with a power-law having exponent $\alpha \approx 3$, at time-scales of
both 1 min and 1 day. This "inverse cubic law" is quantitatively similar to what has been observed
in developed markets, thereby providing strong evidence of universality in the behavior of market
fluctuations.
I. INTRODUCTION

Financial markets can be viewed as complex systems
with a large number of interacting components that are
subject to external influences or information flow. Physicists
are being attracted in increasing numbers to the
study of financial markets by the prospect of discovering
universalities in their statistical properties [1], [2], [3].
This has partly been driven by the availability of large
amounts of electronically recorded data with very high
temporal resolution, making it possible to study various
indicators of market activity. Among the various candidates
for market-invariant features, the most widely studied
are the distributions of fluctuations in overall market
indicators such as market indices.

To study these fluctuations such that the result is independent
of the scale of measurement, we define the
logarithmic return for a time scale $\Delta t$ as

$$R(t, \Delta t) = \ln I(t + \Delta t) - \ln I(t), \tag{1}$$

where I(t) is the market index at time t and $\Delta t$ is the
time-scale over which the fluctuation is observed. Market
indices, rather than individual stock prices, have been
the focus of most previous studies, as the former are more
readily available and also give overall information about
the market. By contrast, individual stocks are susceptible
to sector-specific as well as stock-specific influences,
and may not be representative of the entire market.
These two quantities, in fact, characterize the market
from different perspectives: the microscopic description
is based on individual stock price movements,
while the macroscopic point of view focuses on the
collective market behavior as measured by the market
index.
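As an illustration of Eq. (1), the following minimal sketch computes logarithmic returns from a sampled index series. The function name `log_returns`, the `lag` parameter (the time scale $\Delta t$ expressed in samples) and the toy values are assumptions introduced here for illustration; they are not part of the original analysis.

```python
import numpy as np

def log_returns(index, lag=1):
    """Return R(t, dt) = ln I(t + dt) - ln I(t) for a sampled index series.

    `lag` is the time scale dt expressed in number of samples (an
    illustrative parameterization, not taken from the paper).
    """
    index = np.asarray(index, dtype=float)
    if lag < 1 or lag >= index.size:
        raise ValueError("lag must satisfy 1 <= lag < len(index)")
    if np.any(index <= 0):
        raise ValueError("index values must be positive")
    log_index = np.log(index)
    return log_index[lag:] - log_index[:-lag]

# Toy values for illustration only (not NSE data).
toy_index = np.array([1000.0, 1010.0, 990.0, 1005.0])
returns = log_returns(toy_index)

# Rescaling the index (a change of measurement units) leaves the returns unchanged.
rescaled = log_returns(2.5 * toy_index)
print(np.allclose(returns, rescaled))
```

The last two lines check the property that motivates the definition: multiplying the index by a constant does not change the logarithmic returns.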
The importance of interactions among stocks, relative
to external information, in governing market behavior
has emerged only in recent times. The earliest theories
of market activity, e.g., Bachelier's random walk model,
assumed that price changes are the result of several independent
external shocks and therefore predicted the
resulting distribution to be Gaussian [4]. As an additive
random walk may lead to negative stock prices, a
better model would be a multiplicative random walk,
where the price changes are measured by logarithmic returns [5].
While the return distribution calculated from
empirical data is indeed seen to be Gaussian at long time
scales, at shorter times the data show much larger fluctuations
than would be expected from this distribution [6].
Such deviations were also observed in commodity price
returns, e.g., in Mandelbrot's analysis of cotton prices,
which were found to follow a Lévy-stable distribution [7].
However, this contradicted the observation that the distribution
converged to a Gaussian at longer time scales.
Later, it was discovered that while the bulk of the return
distribution for the S&P 500 index appears to be
fit well by a Lévy distribution, the asymptotic behavior
shows a much faster decay than expected. Hence,
a truncated Lévy distribution, which has exponentially
decaying tails, was proposed as a model for the distribution
of returns [8]. Subsequently, it was shown that the
tails of the cumulative return distribution for this index
actually follow a power-law,

$$P_c(r > x) \sim x^{-\alpha}, \tag{2}$$

with the exponent $\alpha \approx 3$ (the "inverse cubic law") [9],
well outside the stable Lévy regime $0 < \alpha < 2$. This
is consistent with the fact that at longer time scales the
distribution converges to a Gaussian. Similar behavior
has been reported for the DAX, Nikkei and Hang-Seng
indices [10], [11]. These observations are somewhat surprising,
although not at odds with the "efficient market
hypothesis" in economics, which assumes that the movements
of financial prices are an immediate and unbiased
reflection of incoming news and future earning prospects.
To explain these observations, various multi-agent models
of financial markets have been proposed, where the scaling
laws seen in empirical data arise from interactions
between agents [12]. Other microscopic models, where
the agents (i.e., the traders comprising the market) are
represented by mutually interacting spins and the arrival
of information by external fields, have also been used
to simulate the financial market [13], [14], [15], [16]. Among
non-microscopic approaches, multifractal processes have
been used extensively for modelling such scale-invariant
properties [17], [18]. The multifractal random walk model
has generalized the usual random walk model of financial
price changes and accounts for many of the observed empirical
properties [19].

FIG. 1: Time evolution of the National Stock Exchange of India from 1994 to 2006 in terms of (left) the total number of trades
and (right) the total turnover (i.e., traded value).

However, on the empirical front, there is some controversy
about the universality of the power-law nature of
the tails of the index return distribution. In the case
of developed markets, e.g., the All Ordinaries index of the
Australian stock market, the negative tail has been reported
to follow the inverse cubic law while the positive
tail is closer to Gaussian [20]. Again, other studies
of the Hang Seng and Nikkei indices report the return
distribution to be exponential [21], [22]. For developing
economies, the situation is even less clear. There have
been several claims that emerging markets have return
distributions that are significantly different from those of developed
markets. For example, a recent study contrasting the
behavior of indices from seven developed markets with
the KOSPI index of the Korean stock market found that
while the former exhibit the inverse cubic law, the latter
follows an exponential distribution [23]. Another study
of the Korean stock market reported that the index distribution
has changed to exponential from a power-law
nature only in recent years [24]. On the other hand, the
IBOVESPA index of the Sao Paulo stock market has been
claimed to follow a truncated Lévy distribution [25], [26].
However, there have also been reports of the inverse cubic
law for emerging markets, e.g., for the Mexican stock
market index IPC [27] and the WIG20 index of the Polish
stock market [28]. A comparative analysis of 27 indices
from both mature and emerging markets found their tail
behavior to be similar [29].



Many of the studies reported above have only used
graphical fitting to determine the nature of the observed
return distribution. This has recently come under criticism,
as such methods often result in erroneous conclusions.
Hence, a more accurate study using reliable
statistical techniques needs to be carried out to decide
whether emerging markets do behave similarly to developed
markets in terms of fluctuations. In this paper we
have carried out such a study for the Indian financial
markets. The Indian data are of unique importance in deciding
whether emerging markets behave differently from
developed markets, as India is one of the fastest growing
financial markets in the world. A recent study of individual
stock prices in the National Stock Exchange (NSE)
of India has claimed that the corresponding return distribution
is exponentially decaying at the tails [30], and
not the "inverse cubic law" that is observed for developed
markets [TJ Hl|- However, a more detailed study over 
a larger data set has established the inverse cubic law
for individual stock prices [32]. On the other hand, to
get a sense of the nature of fluctuations for the entire
market, one needs to look at the corresponding distribution
for the market index. Although the individual stock
prices and the market index are related, it is not obvious
that they should have the same kind of distribution,
as this relation depends on the degree of correlation
between different stock price movements. While a heavy-tailed
distribution has been reported for the Nifty index
of NSE, it shows significant deviation from the inverse
cubic law [33]. In this paper, we report an analysis of tick-by-tick
data for this index, along with a few others that together
fully characterize the Indian market, to conclusively establish
the nature of their fluctuation distribution.

We focus on the two largest stock exchanges in India,
the NSE and the Bombay Stock Exchange (BSE). NSE,
the more recent of the two, is not only the most active
stock exchange in India, but also the third largest in the
world in terms of transactions [34]. We have studied the
behavior of this market over the entire period of its existence.
During this period, the NSE has grown by several
orders of magnitude (Fig. 1), demonstrating its emerging
character. In contrast, BSE is the oldest stock exchange
in Asia, and was the largest in India until the creation
of NSE. However, over the past decade its share of the
Indian financial market has fallen significantly. Therefore,
we contrast two markets which have evolved very
differently in the period under study.

We show that the Indian financial market, one of the
largest emerging markets in the world, has index fluctuations
similar to those seen for developed markets. Further,
we find that the nature of the distribution is invariant
with respect to different market indices, as well as the
time-scale of observation. Taken together with our previous
work on the distribution of individual stock price
returns in Indian markets [32], [35], this strongly argues in
favor of the universality of the nature of the fluctuation distribution,
regardless of the stage of development of the
market or the economy underlying it.



II. DATA DESCRIPTION 

Our primary data-set is that of the Nifty index of NSE
which, along with the Sensex of BSE, is one of the primary
indicators of the Indian market. It is composed
of the top 50 highly liquid stocks, which make up more
than half of the market capitalisation in India. We have
used (i) high-frequency data from Jan 2003 - Mar 2004,
where the market index is recorded every time a trade
takes place for an index component. The total number
of records in this database is about $6.8 \times 10^7$. We have
also looked at data over much longer periods by considering
daily closing values of (ii) the Nifty index for the
16-year period Jul 1990 - May 2006 and (iii) the Sensex
index of BSE for the 15-year period Jan 1991 - May
2006. In addition, we have also looked at the BSE 500
index for the much shorter period Feb 1999 - May 2006.
Sensex consists of the 30 largest and most actively traded
stocks, representative of various sectors of BSE, while the
BSE 500 is calculated using 500 stocks representing all
20 major sectors of the economy.



III. DISTRIBUTION OF INDEX RETURNS 

We first report the analysis of the high-frequency data
for the NSE Nifty index, which we sampled at 1-min
intervals to generate the time series I(t). From I(t)
we compute the logarithmic return $R(t, \Delta t)$, defined in
Eq. (1). The return distributions calculated using
different time intervals may have varying widths, owing
to differences in their volatility, defined as
$\sigma^2_{\Delta t} = \langle R^2 \rangle - \langle R \rangle^2$, where $\langle \cdots \rangle$ denotes the time average over
the given time period. Hence, to be able to compare
the distributions, we need to normalize the returns R(t)
by dividing them by the volatility $\sigma_{\Delta t}$. However, this
leads to systematic underestimation of the tail of the
normalized return distribution. This is because, even
when a single return R(t) is very large, the scaled return
is bounded by $\sqrt{N}$ (N being the number of returns in the series),
as the same large return also contributes
to the variance $\sigma^2_{\Delta t}$. To avoid this, we remove
the contribution of R(t) itself from the volatility, and the
new rescaled volatility is defined as

$$\sigma_{\Delta t}(t) = \sqrt{\frac{1}{N-1} \sum_{t' \neq t} \{R(t', \Delta t)\}^2 - \langle R(t', \Delta t) \rangle^2}, \tag{3}$$

as described in Ref. [2]. The resulting normalized return
is given by

$$r(t, \Delta t) = \frac{R - \langle R \rangle}{\sigma_{\Delta t}(t)}. \tag{4}$$

FIG. 2: (a) TP-statistic and (b) TE-statistic as a function of
the lower cut-off u for positive returns of the NSE with time
interval $\Delta t = 1$ min. The broken lines indicate plus or minus
one standard deviation of the statistics.
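The effect of excluding each return from its own volatility can be sketched as follows. This is a minimal illustration on synthetic data, not the code used for the analysis: the function name `normalized_returns` is hypothetical, and it assumes that both averages in the rescaled volatility are taken over all times other than t.

```python
import numpy as np

def normalized_returns(R):
    """Normalize returns by a volatility that excludes each return's own contribution.

    Assumption (for illustration): both the mean square and the mean entering
    the rescaled volatility are averaged over all t' != t.
    """
    R = np.asarray(R, dtype=float)
    n = R.size
    if n < 3:
        raise ValueError("need at least three returns")
    # Leave-one-out averages over t' != t, computed for every t at once.
    loo_mean_sq = (np.sum(R**2) - R**2) / (n - 1)
    loo_mean = (np.sum(R) - R) / (n - 1)
    sigma_t = np.sqrt(loo_mean_sq - loo_mean**2)
    return (R - R.mean()) / sigma_t

rng = np.random.default_rng(0)
R = rng.normal(size=1000)   # synthetic returns, not market data
R[10] = 100.0               # one artificially large return

plain = (R - R.mean()) / R.std()     # ordinary normalization
rescaled = normalized_returns(R)     # normalization with the return itself excluded
print(plain[10], rescaled[10])
```

With the ordinary normalization the large return inflates the volatility it is divided by, so its scaled value stays below $\sqrt{N}$; with the leave-one-out volatility it is no longer suppressed in this way.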



Prior to obtaining numerical estimates of the distribution
parameters, we carry out a test for the nature of the
return distribution, i.e., whether it follows a power-law
or an exponential or neither. For this purpose we use
a statistical tool that is independent of the quantitative
value of the distribution parameters. Usually, it is observed
that the tail of the return distribution decays at a
slower rate than the bulk. Therefore, the determination
of the nature of the tail depends on the choice of the lower
cut-off u of the data used for fitting a theoretical distribution.
To observe this dependence on the cut-off u, we
calculate the TP- and TE-statistics [36], [37] as a function
of u, comparing the behavior of the tail of the empirical
distributions with power-law and exponential functional
forms, respectively. These statistics converge to zero if
the underlying distribution follows a power-law (TP) or
exponential (TE), regardless of the value of the exponent
or the scale parameter (see Appendix A). On the other
hand, they deviate from zero if the observed return distribution
differs from the target theoretical distribution
(power-law for TP and exponential for TE).
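The behavior of the two statistics can be illustrated on synthetic samples. The sketch below is hedged: it assumes the statistics are built from sample log-moments as $\mathrm{TP} = \overline{y}^{\,2} - \tfrac{1}{2}\overline{y^2}$ with $y = \log(x/u)$, and $\mathrm{TE} = \overline{w^2} - \overline{w}^{\,2} - \pi^2/6$ with $w = \log(x/u - 1)$. These combinations vanish for exact power-law and exponential tails respectively, but the normalization used in the cited references may differ. The function names and the synthetic data are illustrative only.

```python
import numpy as np

def tp_statistic(x, u):
    """Assumed form of TP: vanishes when x/u has a pure power-law tail above u."""
    x = np.asarray(x, dtype=float)
    y = np.log(x[x > u] / u)
    return np.mean(y) ** 2 - 0.5 * np.mean(y**2)

def te_statistic(x, u):
    """Assumed form of TE: vanishes when x - u is exponentially distributed above u."""
    x = np.asarray(x, dtype=float)
    w = np.log(x[x > u] / u - 1.0)
    return np.mean(w**2) - np.mean(w) ** 2 - np.pi**2 / 6.0

rng = np.random.default_rng(1)
# Synthetic samples (not market data):
pareto = (1.0 - rng.random(200_000)) ** (-1.0 / 3.0)    # P_c(x) = x^-3 for x >= 1
expo = 1.0 + rng.exponential(scale=0.5, size=200_000)   # exponential above u = 1

# Scan the lower cut-off u, as done for the empirical returns.
for u in (1.0, 1.5, 2.0):
    print(u, tp_statistic(pareto, u), te_statistic(expo, u), tp_statistic(expo, u))
```

For each cut-off, the first two printed statistics stay close to zero because each sample matches its target form, while the TP-statistic evaluated on the exponential sample does not.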

Fig. 2 shows visually the deviation of the empirical
data from the power-law and exponential distributions.
The TP- and the TE-statistics are plotted as functions
of the lower cut-off u for 1-min returns of the NSE Nifty
index. The TP-statistic shows a large deviation for u < 1,
after which it converges to zero, indicating power-law
behavior for large u. Correspondingly, the TE-statistic
excludes an exponential model for u > 1 as well as for
very low values of u, although over the intermediate range
$2 \times 10^{-1} < u < 6 \times 10^{-1}$ an exponential approximation
may be possible.

FIG. 3: The cumulative distribution of the normalized 1-min
return for the NSE Nifty index. The broken line indicates a
power-law with exponent $\alpha = 3$.

Fig. 3 shows the cumulative distribution of the normalized
returns for $\Delta t = 1$ min. For both positive and
negative tails, there is an asymptotic power-law behavior.
The power-law regression fit for the region r > 2 gives exponents
for the positive and the negative tails estimated
as

$$\alpha \approx \begin{cases} 2.98 \pm 0.09 & \text{(positive tail)}, \\ 3.37 \pm 0.10 & \text{(negative tail)}. \end{cases} \tag{5}$$

Note that, to avoid artifacts due to the data measurement
process in the calculation of the return distribution for $\Delta t < 1$ day,
we have removed the returns corresponding to
overnight changes in the index value.
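A power-law regression fit of this kind can be sketched as an ordinary least-squares fit to the empirical cumulative distribution on log-log axes, restricted to r > 2. The code below is an assumption about the general procedure rather than a reproduction of the fit used here; `tail_exponent_fit` is a hypothetical name and the data are synthetic.

```python
import numpy as np

def tail_exponent_fit(r, r_min=2.0):
    """Least-squares slope of the empirical cumulative distribution for r > r_min."""
    r = np.asarray(r, dtype=float)
    tail = np.sort(r[r > r_min])
    if tail.size < 2:
        raise ValueError("too few points above r_min")
    # Empirical cumulative probability at each tail point:
    # the fraction of all samples that are >= that point.
    p_c = np.arange(tail.size, 0, -1) / r.size
    slope, _ = np.polyfit(np.log(tail), np.log(p_c), 1)
    return -slope

rng = np.random.default_rng(2)
# Synthetic two-sided sample with inverse cubic tails (not market data):
# rng.pareto draws Lomax variates; adding 1 gives a classical Pareto with scale 1.
magnitude = rng.pareto(3.0, size=100_000) + 1.0
r = rng.choice([-1.0, 1.0], size=magnitude.size) * magnitude

alpha_pos = tail_exponent_fit(r)     # positive tail
alpha_neg = tail_exponent_fit(-r)    # negative tail
print(alpha_pos, alpha_neg)
```

The negative tail is handled by applying the same function to the sign-reversed returns.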

We also perform an alternative estimation of the tail
index of the above distribution by using the Hill estimator [38],
which is the maximum likelihood estimator
of $\alpha$. For finite samples, however, the expected value of
the Hill estimator is biased and depends crucially on the
choice of the number of order statistics used for calculation.
We have used the bootstrap procedure [39] to
reduce this bias and to choose the optimal number of order
statistics for calculating the Hill estimator, described
in detail in Appendix B. We found $\alpha \approx 3.22$ and 3.47
for the positive and the negative tails, respectively.

To investigate the effect of intra-day variations in market
activity, we analyze the 1-min return time series by
dividing it into two parts, one corresponding to returns
generated in the opening and the closing hours of the
market, and the other corresponding to the intermediate
time period. In general, it is known that the average
intra-day volatility of stock returns follows a U-shaped
pattern [40], [41], and one can expect this to be reflected in
the nature of the fluctuation distribution for the opening
and closing periods, as opposed to the intervening period.
We indeed find the index fluctuations for these two
data sets to be different (Fig. 4). In particular, the cumulative
distribution tail for the opening and closing hour
returns shows a power-law scaling with exponent close to
3, whereas for the intermediate period we see that the
exponent is close to 4. This observation is similar to that
reported for the German DAX index, where removal of
the first few minutes of return data after the daily opening
resulted in a power-law distribution with a different
exponent compared to the intact data set [42].

Next, we extend our analysis to longer time scales
$\Delta t$. We find that time aggregation of the data increases
the value of $\alpha$. The tail of the return distribution still retains
its power-law form (Fig. 5), until at longer time
scales the distribution slowly converges to Gaussian behavior
(Table I). The results are invariant with respect
to whether one calculates the return using the sampled index
value at the end point of an interval or the average index
value over the interval. Figure 5 shows the cumulative
distribution of normalized Nifty returns for time scales
up to 60 min. However, using a similar procedure for
generating daily returns from the tick-by-tick data would
give us a very short time series. This is not enough for
reliable analysis, as it takes at least 3000 data points for
a meaningful estimate of the tail index.
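Time aggregation with end-point sampling can be sketched as below: because logarithmic returns are additive, the return over m minutes is the sum of m consecutive 1-min returns. The helper name `aggregate_returns` and the synthetic index are assumptions for illustration, and the sketch uses non-overlapping intervals.

```python
import numpy as np

def aggregate_returns(r, m):
    """Sum consecutive, non-overlapping blocks of m returns (incomplete last block dropped)."""
    r = np.asarray(r, dtype=float)
    if m < 1:
        raise ValueError("m must be a positive integer")
    n_blocks = r.size // m
    return r[: n_blocks * m].reshape(n_blocks, m).sum(axis=1)

rng = np.random.default_rng(3)
# Synthetic 1-min index with 301 samples (not market data).
index = 1000.0 * np.exp(np.cumsum(rng.normal(scale=1e-3, size=301)))

r_1min = np.diff(np.log(index))          # 300 one-minute returns
r_5min = aggregate_returns(r_1min, 5)    # 60 five-minute returns
direct = np.diff(np.log(index[::5]))     # returns from the index sampled every 5 min
print(np.allclose(r_5min, direct))
```

Each level of aggregation divides the number of available returns by m, which is why fifteen months of intra-day data yield only a few hundred daily returns.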

For this reason, we have analyzed the daily data using
a different source, with the time period stretching
over a considerably longer period (16 years). The return
distribution of the daily closing data of Nifty shows
qualitatively similar behavior to the 1-min distribution.
The Sensex index, which is from another stock exchange,
also follows a similar distribution (Fig. 6). The measured
exponent values are all close to 3. This does not contradict
the earlier observation that $\alpha$ increases with $\Delta t$,
because increasing the sample size (as has been done for
$\Delta t = 1$ day) improves the estimation of $\alpha$. This underlines
the invariance of the nature of market fluctuations
with respect to time aggregation, the interval used and different
exchanges.



IV. DISCUSSION AND CONCLUSION 

The much shorter data-set of the BSE 500 daily returns
shows a significant departure from power-law behavior,
essentially following an exponential distribution (figure
not shown). This is not surprising, as looking at data
over shorter periods can result in misidentification of the
nature of the distribution. Specifically, the relatively low
number of data points corresponding to returns of large
magnitude can lead to missing out the long tail. In fact,
even for individual stocks in developed markets, although
the tails follow a power-law, the bulk of the return distribution
is exponential [43]. This problem arising from
using limited data-sets might be one of the reasons why
some studies have seen significant deviation of the index return
distribution from a power-law.



[Figure 4: two log-log panels titled "Opening/Closing Hours" and "Intermediate Hours"; horizontal axis: Normalized returns; symbols distinguish the positive tail and the negative tail.]

FIG. 4: Intra-day variation in the cumulative distribution of the normalized 1-min return for the NSE Nifty index: return
distribution during (left) the opening and closing hours (the broken line indicates a power law with exponent $\alpha = 3$) and (right)
the intermediate time period (the broken line indicates a power law with exponent $\alpha = 4$).



TABLE I: Comparison of the power-law exponent $\alpha$ of the cumulative distribution function for various index returns. Power-law
regression fits are done in the region r > 2. The Hill estimator is calculated using the bootstrap algorithm.

Index              $\Delta t$    Power-law fit                  Hill estimator
                                 Positive       Negative        Positive       Negative

Nifty ('03-'04)    1 min         2.98 ± 0.09    3.37 ± 0.10     3.22 ± 0.03    3.47 ± 0.03
                   5 min         4.42 ± 0.37    3.44 ± 0.21     4.51 ± 0.03    4.84 ± 0.03
                   15 min        5.58 ± 0.88    3.96 ± 0.27     6.25 ± 0.03    4.13 ± 0.04
                   30 min        5.13 ± 0.41    3.92 ± 0.45     5.65 ± 0.03    4.30 ± 0.03
                   60 min        5.99 ± 1.52    4.42 ± 0.65     7.85 ± 0.03    5.11 ± 0.04

Nifty ('90-'06)    1 day         3.10 ± 0.34    3.18 ± 0.28     3.33 ± 0.14    3.37 ± 0.14

Sensex ('91-'06)   1 day         3.33 ± 0.77    3.45 ± 0.25     2.93 ± 0.15    3.84 ± 0.12
FIG. 5: The negative tail of the cumulative distribution of the
NSE Nifty index returns for different time intervals $\Delta t$ up to
60 min. The broken line indicates a power-law with exponent
$\alpha = 4$.



A more serious problem is that the analysis in many
of these studies is usually performed only by graphically
fitting the data with a theoretical distribution function.
Such a visual judgement of the goodness of fit may lead
to erroneous characterization of the nature of the fluctuation
distribution. Graphical procedures are often subjective,
particularly with respect to the choice of the lower cut-off
up to which fitting is carried out. This dependence of
the theoretical distribution that best describes the tail
on the cut-off has been explicitly demonstrated through
the use of the TP- and TE-statistics in this paper. Moreover,
recent studies have criticized the reliability of graphical
methods by showing that least-squares fitting for estimating
the power-law exponent tends to provide biased estimates,
while the maximum likelihood method produces
more accurate and robust estimates [44], [45]. We have therefore
also used the Hill estimator to determine the tail exponents.

If the individual stocks follow the inverse cubic law,
it would be reasonable to suppose that the index, which
is a weighted average of several stocks, will also behave
similarly, provided the different stocks move in a correlated
fashion [13]. As the price movements of stocks in
an emerging market are even more correlated than in developed
markets [46], it is expected that the returns for
stock prices and the index should follow the same distribution.
Therefore, the demonstration of the inverse
cubic law for the index fluctuations in the Indian market
is consistent with our previous study [32] showing that
the individual stock prices in this market follow the same
behavior.

FIG. 6: The cumulative distribution of the normalized 1-day
return for the NSE Nifty and BSE Sensex indices. The broken
line indicates a power-law with exponent $\alpha = 3$.

On the whole, our study points out the remarkable robustness
of the nature of the fluctuation distribution for
Indian market indices. While, in the period under study,
the NSE had begun operation and rapidly increased in
terms of activity, the BSE had existed for a long time
prior to this period and showed a significant decrease in
market share. However, both showed very similar fluctuation
behavior. This indicates that, at least in the Indian
context, the distribution of returns is invariant with respect
to markets. The fact that the distribution is quantitatively
the same as in developed markets implies that it is
also probably independent of the state of the economy.
In addition, our observation that the intra-day return
distribution of the Indian market index shows properties similar
to those reported for developed markets suggests that
even at this level of detail the fluctuation behavior of the
two kinds of markets is rather similar. Therefore, our
results indicate that although markets may differ from
each other in terms of (i) the details of their components,
(ii) the nature of interactions and (iii) their susceptibility
to news from outside the market, there may be
universal mechanisms responsible for generating market
fluctuations, as indicated by the observation of invariant
properties. The rigorous demonstration of such a universal
law for market behavior is significant for the physics
of strongly interacting complex systems, as it suggests
the existence of robust features that are independent of
individual details of different systems.

APPENDIX A: TP-STATISTIC AND TE-STATISTIC

1. TP-statistic

Consider the power-law distribution,

$$F(x) = 1 - P_c(x) = 1 - (u/x)^{\alpha}, \quad \text{for } x > u, \tag{A1}$$

where u is the lower cut-off, and $\alpha$ is the power-law exponent
for the distribution. For a finite sample $x_1, \ldots, x_n$,
the TP-statistic, $\mathrm{TP}(x_1, \ldots, x_n)$, is defined such that it
converges to zero asymptotically for large n [36], [37]. If
the underlying distribution for a sample differs from the
power-law form given in Eq. (A1), TP is seen to deviate
from zero. This statistic is based on the first two
normalized statistical log-moments of the power-law distribution,
where E[z] represents the mathematical expectation of
z. The TP-statistic is then defined as

which tends to zero as $n \to \infty$. The estimation of the
standard deviation for the TP-statistic is provided by the
standard deviation of the sum

2. TE-statistic

Consider the exponential distribution,

$$F(x) = 1 - P_c(x) = 1 - \exp(-(x - u)/d), \quad \text{for } x > u, \tag{A6}$$

where u is the lower cut-off, and d (> 0) is the scale
parameter of the distribution. For a finite sample
$x_1, \ldots, x_n$, the TE-statistic, $\mathrm{TE}(x_1, \ldots, x_n)$, is defined
such that it converges to zero asymptotically for large n
[37]. If the underlying distribution for a sample differs
from the exponential form given in Eq. (A6), TE is seen
to deviate from zero. This statistic is based on the first
two normalized statistical (shifted) log-moments of the
exponential distribution,

where $\gamma = 0.577215$ is the Euler constant, and

$$E_2 = E\left[\log^2\left(\frac{x}{u} - 1\right)\right].$$

As before, $E[\ldots]$ denotes the mathematical expectation.
The TE-statistic is then defined as

which tends to zero as $n \to \infty$. The estimation of the
standard deviation for the TE-statistic is provided by the
standard deviation of the sum



fe=i 



(A10) 
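As a check on the two constructions, the log-moments can be worked out directly from Eqs. (A1) and (A6). The combinations shown below are one form consistent with the text and are stated as an assumption; the normalization adopted in the cited references may differ. Here $z = (x - u)/d$ is an auxiliary variable introduced only for this derivation.

```latex
% Power-law tail, Eq. (A1): y = \log(x/u) is exponentially distributed with rate \alpha.
\begin{align*}
E_1 &= E\left[\log\frac{x}{u}\right] = \frac{1}{\alpha}, &
E_2 &= E\left[\log^2\frac{x}{u}\right] = \frac{2}{\alpha^2}, &
E_1^2 - \tfrac{1}{2}E_2 &= 0 .
\end{align*}
% Exponential tail, Eq. (A6): z = (x-u)/d has a unit exponential distribution,
% for which E[\log z] = -\gamma and Var[\log z] = \pi^2/6.
\begin{align*}
E_1 &= E\left[\log\left(\frac{x}{u}-1\right)\right] = \log\frac{d}{u} - \gamma, &
E_2 - E_1^2 &= \mathrm{Var}[\log z] = \frac{\pi^2}{6} .
\end{align*}
```

In both cases the vanishing combination does not involve $\alpha$ or d, which is why the statistics can test the functional form of the tail without first estimating its parameters.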



APPENDIX B: HILL ESTIMATION OF THE TAIL
EXPONENT $\alpha$

The Hill estimator gives a consistent estimate of the
tail exponent $\alpha$ from random samples of a distribution
with an asymptotic power-law form. For our analysis,
we arrange the returns in decreasing order such that
$r_1 > \cdots > r_n$. Then the Hill estimator (based on the
largest k + 1 values) is given as

$$\gamma_{k,n} = \frac{1}{k} \sum_{i=1}^{k} \log \frac{r_i}{r_{k+1}}, \tag{B1}$$

for $k = 1, \ldots, n - 1$. The estimator $\gamma_{k,n} \to \alpha^{-1}$ when
$k \to \infty$ and $k/n \to 0$. However, for a finite time series,
the expectation value of the Hill estimator is biased, i.e.,
it will consistently over- or underestimate $\alpha$. Further, $\gamma_{k,n}$
depends critically on our choice of k, the number of order statistics
used to compute the Hill estimator.
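A minimal sketch of the Hill estimator for a fixed k is given below, applied to a synthetic sample with a known exponent. The function name `hill_gamma` and the chosen values of k are illustrative assumptions; the bootstrap selection of k used in the paper is not reproduced here.

```python
import numpy as np

def hill_gamma(r, k):
    """Hill estimate of 1/alpha from the k + 1 largest values of r."""
    ordered = np.sort(np.asarray(r, dtype=float))[::-1]   # decreasing order: r_1 >= ... >= r_n
    if not 1 <= k <= ordered.size - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    if ordered[k] <= 0:
        raise ValueError("the k + 1 largest values must be positive")
    # ordered[:k] holds r_1..r_k and ordered[k] is r_{k+1} (0-based indexing).
    return np.mean(np.log(ordered[:k] / ordered[k]))

rng = np.random.default_rng(4)
# Synthetic classical Pareto sample with exponent 3 (not market data).
sample = rng.pareto(3.0, size=100_000) + 1.0

# The estimate of alpha = 1/gamma varies with the number of order statistics k.
for k in (500, 2000, 8000):
    print(k, 1.0 / hill_gamma(sample, k))
```

For an exact power law the estimate is stable across k; for empirical returns, where the power law holds only asymptotically, the dependence on k is what motivates a data-driven choice.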

If the form of the distribution function from which the
random sample is chosen is known, then the bias and
the stochastic error variance of the Hill estimator can
be calculated. From this, the optimum k value can be
obtained such that the asymptotic mean square error of
the Hill estimator is minimized. Increasing k reduces the
variance because more data are used, but increases the
bias because the power-law is assumed to hold only in
the extreme tail. Unfortunately, the distribution for the
empirical data is not known and hence this procedure
has to be replaced by an asymptotically equivalent data-driven
process.

One such method is the subsample bootstrap method.
This method can be used to estimate an optimal number
for the order statistics (k) that will reduce the asymptotic
mean square error of the Hill estimator. However, this
process requires the choice of certain parameters, e.g.,
the subsample size $n_s$ and the range of k values in which
one searches for the minimum of the bootstrap statistics.
We briefly describe this procedure below; for details
and mathematical validation of this procedure, please see
Ref. [39].

We assume the underlying empirical distribution function
to be heavy-tailed, viz.,

$$P_c(x) = a x^{-\alpha} \left[ 1 + b x^{-\beta} \right], \tag{B2}$$

with $\alpha, \beta, a > 0$ and $-\infty < b < \infty$. We first calculate an
initial $\gamma_0 = \gamma_{k_0,n}$ for the original series with a reasonably
chosen (but non-optimal) $k_0$. Then we choose various
subsamples of size $n_s$ randomly from the original series,
with $n_s$ orders of magnitude smaller than n. The quantity
$\gamma_0$ is a good approximation of the subsample $\alpha^{-1}$, since
the error in $\gamma$ is much larger for $n_s$ than for n observations.
The optimal order statistics $k_s$ for the subsample
is found by computing $\gamma(k_s, n_s)$ for different values of $k_s$
and then minimising the deviation from $\gamma_0$. Given $k_s$,
the suitable full sample k can be found by using

$$k = k_s \left( \frac{n}{n_s} \right)^{\frac{2\beta}{2\beta + \alpha}}. \tag{B3}$$

Here the initial estimate of $\alpha$ is taken to be $1/\gamma_0$. Further,
we have considered $\beta = \alpha$, as done by Hall [47], although
the results are not very sensitive to the choice of $\beta$. Once
k is calculated, the final estimate of the tail index is given
by $\alpha = 1/\gamma_{k,n}$.

For calculating the initial $\gamma_0$ we have chosen $k_0$ to be
0.5% of the sample size n. We randomly pick 1000 subsamples, each of size
$n_s = n/40$, from the full data set.
To obtain the optimal $k_s$, we confine ourselves to 4% of the
subsample size $n_s$. To find the stochastic error in our
estimation of $\alpha$, we have computed the 95% confidence
interval as given by $\pm 1.96\,[1/(\alpha^2 m)]^{1/2}$. Although a jackknife
algorithm can also be used to calculate this error
bound, the results obtained using this method will be
close to those obtained using the bootstrap method over
many realizations [39], as we have done in this paper.
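The two closed-form steps of this procedure, rescaling the subsample optimum $k_s$ to the full sample and evaluating the quoted confidence half-width, can be sketched as follows. This is an illustration only: the function names are hypothetical, the exponent $2\beta/(2\beta + \alpha)$ is the standard scaling assumed for the full-sample k, and m is assumed to denote the number of order statistics entering the final estimate.

```python
import math

def full_sample_k(k_s, n, n_s, alpha, beta):
    """Rescale the subsample optimum k_s to the full sample size n.

    Assumes the scaling k = k_s * (n / n_s) ** (2*beta / (2*beta + alpha)).
    """
    exponent = 2.0 * beta / (2.0 * beta + alpha)
    return int(round(k_s * (n / n_s) ** exponent))

def ci_half_width(alpha, m):
    """Half-width 1.96 * [1 / (alpha^2 m)]^(1/2) quoted in the text.

    m is assumed here to be the number of order statistics used.
    """
    return 1.96 * math.sqrt(1.0 / (alpha**2 * m))

# Illustrative numbers (not results from the paper): n_s = n / 40 and beta = alpha,
# so the exponent is 2/3 and k is about 40 ** (2/3), i.e. roughly 11.7 times k_s.
k = full_sample_k(k_s=100, n=40_000, n_s=1_000, alpha=3.0, beta=3.0)
print(k, ci_half_width(alpha=3.0, m=900))
```

With $\beta = \alpha$ the rescaling factor depends only on the ratio $n/n_s$, so the initial estimate of $\alpha$ does not enter this step at all.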